The file consists of text lines, one per article. The file is normally kept sorted in the order in which articles are received, although this is not a requirement. Each time it files an article, innd(8) appends a new line; expire(8) builds a new version of the file by removing old articles and purging old entries.
Each line consists of two or three fields separated by a tab, shown below as \t:
<Message-ID> \t date
<Message-ID> \t date \t files
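The two layouts above can be told apart by counting tab-separated fields. The following is a minimal sketch, not code from INN; the function name split_history_line is hypothetical, and the sketch assumes the spaces around \t in the layouts are for display only.

```python
def split_history_line(line):
    """Split one history line into (message_id, date, files).

    files is None when the line has only two fields.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (2, 3):
        raise ValueError(
            f"expected 2 or 3 tab-separated fields, got {len(fields)}"
        )
    message_id, date = fields[0], fields[1]
    files = fields[2] if len(fields) == 3 else None
    return message_id, date, files
```

The date and files values are returned as raw strings so that each can be decoded separately.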
The Message-ID field is the value of the article's Message-ID header, including the angle brackets.
The date field consists of three sub-fields separated by tildes. Each numeric sub-field is the text representation of the number of seconds since the epoch (that is, a time_t; see gettimeofday(2)). The first sub-field is the article's arrival date. If copies of the article are still present then the second sub-field is either the value of the article's Expires header, or a hyphen if no expiration date was specified. If an article has been expired then the second sub-field will be a hyphen. The third sub-field is the value of the article's Date header, recording when the article was posted.
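As an illustration of the sub-field layout, the sketch below decodes a date field into three values, mapping a hyphen in the second position to None. The names HistoryDate and parse_date_field are hypothetical and are not part of INN.

```python
from typing import NamedTuple, Optional


class HistoryDate(NamedTuple):
    arrived: int             # when the article arrived
    expires: Optional[int]   # None when the sub-field is a hyphen
    posted: int              # value of the article's Date header


def parse_date_field(field: str) -> HistoryDate:
    """Decode 'arrived~expires~posted' into integers (seconds since the epoch)."""
    parts = field.split("~")
    if len(parts) != 3:
        raise ValueError(f"expected 3 tilde-separated sub-fields, got {len(parts)}")
    arrived, expires, posted = parts
    return HistoryDate(
        int(arrived),
        None if expires == "-" else int(expires),
        int(posted),
    )
```

A hyphen alone does not say whether the article has been expired or simply had no Expires header; the files field has to be consulted for that.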
The files field is a set of entries separated by one or more spaces. Each entry consists of the name of the newsgroup, a slash, and the article number. This field is empty if the article has been expired.
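The files field can be decoded along these lines. This is a sketch under the assumption that the article number is the text after the last slash in each entry; parse_files_field is a hypothetical name.

```python
def parse_files_field(field):
    """Return a list of (newsgroup, article_number) pairs.

    An empty field (an expired article) yields an empty list.
    """
    entries = []
    # Entries may be separated by more than one space, so drop empty pieces.
    for entry in (piece for piece in field.split(" ") if piece):
        newsgroup, slash, number = entry.rpartition("/")
        if not slash or not newsgroup or not number.isdigit():
            raise ValueError(f"malformed files entry: {entry!r}")
        entries.append((newsgroup, int(number)))
    return entries
```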
For example, an article cross-posted to comp.sources.unix and comp.sources.d that was posted on February 10, 1991 (and received three minutes later), with an expiration date of May 5, 1991, could have a history line like the following:
<312@litchi.foo.com> \t 666162180~673329600~666162000 \t comp.sources.unix/1104 comp.sources.d/7056
In addition to the text file, there is a dbz(3z) database associated with the file that uses the Message-ID field as a key to determine the offset in the text file where the associated line begins. For historical reasons, the key includes the trailing \0 byte (which is not stored in the text file).
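The relationship between key and offset can be sketched with an in-memory dictionary. This is only an illustration of the mapping: it is not the dbz(3z) interface, and the names build_offset_index and lookup are hypothetical.

```python
import io


def build_offset_index(history):
    """Map each Message-ID (plus a trailing NUL byte) to its line's byte offset.

    history is a binary file-like object positioned at the start of the file.
    """
    index = {}
    while True:
        offset = history.tell()
        line = history.readline()
        if not line:
            break
        message_id = line.split(b"\t", 1)[0]
        # The key carries a trailing \0 that does not appear in the text file.
        index[message_id + b"\0"] = offset
    return index


def lookup(history, index, message_id):
    """Return the history line for message_id, or None if it is not indexed."""
    offset = index.get(message_id + b"\0")
    if offset is None:
        return None
    history.seek(offset)
    return history.readline()


# Usage with a small in-memory stand-in for the text file.
sample = io.BytesIO(b"<a@b>\t1~-~2\tx/1\n<c@d>\t3~-~4\n")
sample_index = build_offset_index(sample)
second_line = lookup(sample, sample_index, b"<c@d>")
```

Because the file is opened in binary mode, the stored offsets are byte offsets and can be passed directly to seek().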